Prototypical Cross-Attention Networks For Multiple Object Tracking And Segmentation